Abstract

Using Monte-Carlo simulation, this paper studies the application of the Bayesian
neural network (BNN) to determining the neutrino incoming direction in reactor
neutrino experiments and to locating supernova explosions with scintillator
detectors. Compared to the method in Ref. [1], the BNN significantly reduces the
uncertainty on the measured neutrino direction. The uncertainty is about 1.0° at
the 68.3% C.L. for reactor neutrinos and about 0.6° at the 68.3% C.L. for
supernova neutrinos. Compared to the method in Ref. [1], the uncertainty
attainable with the BNN is smaller by a factor of about 20, and compared to the
Super-Kamiokande experiment (SK) it is smaller by a factor of about 8.

Keywords: Bayesian neural network, neutrino incoming direction, reactor neutrino,
supernova neutrino

PACS numbers: 07.05.Mh, 29.85.Fj, 14.60.Pq, 95.85.Ry

1 Introduction

Locating a $\nu$ source is very important for the study of galactic supernova
explosions. The determination of the neutrino incoming direction can be used to
locate a supernova, especially if the supernova is not optically visible. A
method based on the inverse $\beta$ decay, $\bar{\nu}_e + p \to e^+ + n$, has
been discussed in Ref. [1]. That method can be applied to determine both a
reactor neutrino direction and a supernova neutrino direction. However, the
uncertainty on the location of the $\nu$ source attainable with that method is
not small enough: it is almost 2 times as large as that of the
Super-Kamiokande experiment (SK). We therefore apply the Bayesian neural
network (BNN) [2] to locate $\nu$ sources in order to decrease the uncertainty
on the measurement of the neutrino incoming direction.
A BNN is a neural network trained with Bayesian statistics. Like an ordinary
neural network it is a non-linear function, but it also controls model
complexity. Its flexibility makes it possible to discover more general
relationships in data than traditional statistical methods can, and its
preference for simple models lets it handle the over-fitting problem better
than general neural networks [3]. BNNs have been used for particle
identification and event reconstruction in high energy physics experiments,
for example in Refs. [4, 5, 6, 7].

In this paper, Monte-Carlo simulation is used to study the application of the
BNN method to determining the neutrino incoming direction in reactor neutrino
experiments and to locating supernova explosions with scintillator detectors.

2 Regression with BNN[2, 6] 

The idea of the BNN is to regard the process of training a neural network as a
Bayesian inference. Bayes' theorem is used to assign a posterior density to each
point, $\theta$, in the parameter space of the neural networks. Each point
$\theta$ denotes a neural network. In the BNN method, one performs a weighted
average over all points in the parameter space of the neural network, that is,
over all neural networks. The method makes use of training data
$\{(x_1, t_1), (x_2, t_2), \ldots, (x_n, t_n)\}$, where $t_i$ is the known
target value associated with the data $x_i$, which has $P$ components if there
are $P$ input values in the regression. That is, the set of data
$x = (x_1, x_2, \ldots, x_n)$ corresponds to the set of targets
$t = (t_1, t_2, \ldots, t_n)$. The posterior density assigned to the point
$\theta$, that is, to a neural network, is given by Bayes' theorem



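$$ p(\theta \mid x, t) = \frac{p(x, t \mid \theta)\, p(\theta)}{p(x, t)} = \frac{p(t \mid x, \theta)\, p(x \mid \theta)\, p(\theta)}{p(t \mid x)\, p(x)} = \frac{p(t \mid x, \theta)\, p(\theta)}{p(t \mid x)} \qquad (1) $$
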
where the data $x$ do not depend on $\theta$, so $p(x \mid \theta) = p(x)$. We
need the likelihood $p(t \mid x, \theta)$ and the prior density $p(\theta)$ in
order to assign the posterior density $p(\theta \mid x, t)$ to the neural
network defined by the point $\theta$. $p(t \mid x)$ is called the evidence and
plays the role of a normalizing constant, so we ignore it. That is,

$$ \mathrm{Posterior} \propto \mathrm{Likelihood} \times \mathrm{Prior} \qquad (2) $$

We consider a class of neural networks defined by the function

$$ y(x, \theta) = b + \sum_{j=1}^{H} v_j \sin\left( a_j + \sum_{i=1}^{P} u_{ij} x_i \right) \qquad (3) $$

The neural networks have $P$ inputs, a single hidden layer of $H$ hidden nodes
and one output. In the particular BNN described here, each neural network has
the same structure. The parameters $u_{ij}$ and $v_j$ are called the weights,
and $a_j$ and $b$ are called the biases. Both sets of parameters are referred
to collectively as the weights of the BNN, $\theta$. $y(x, \theta)$ is the
predicted target value. We assume that the noise on the target values can be
modeled by a Gaussian distribution, so the likelihood of $n$ training events is



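$$ p(t \mid x, \theta) = \prod_{i=1}^{n} \exp\left[ -\frac{(t_i - y(x_i, \theta))^2}{2\sigma^2} \right] = \exp\left[ -\sum_{i=1}^{n} \frac{(t_i - y(x_i, \theta))^2}{2\sigma^2} \right] \qquad (4) $$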

where $t_i$ is the target value and $\sigma$ is the standard deviation of the
noise. The events are assumed to be independent of each other. The likelihood
of the predicted target value is then computed with Eq. (4).
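
As an illustration of Eqs. (3) and (4), the network function and the Gaussian
log-likelihood might be written as follows. This is only a sketch: the array
shapes, the function names and the random test values are assumptions made for
the example, and the activation is taken to be sin as written in Eq. (3).

```python
import numpy as np

def network_output(x, u, a, v, b):
    """Eq. (3): y = b + sum_j v_j * sin(a_j + sum_i u_ij * x_i).

    x: (n, P) inputs; u: (P, H) input-to-hidden weights;
    a: (H,) hidden biases; v: (H,) hidden-to-output weights; b: output bias.
    """
    hidden = np.sin(a + x @ u)   # shape (n, H)
    return b + hidden @ v        # shape (n,)

def log_likelihood(t, x, theta, sigma):
    """Logarithm of Eq. (4): minus the sum of squared residuals over 2 sigma^2."""
    y = network_output(x, *theta)
    return -np.sum((t - y) ** 2) / (2.0 * sigma ** 2)

# Hypothetical example: P = 6 inputs, H = 15 hidden nodes, n = 4 events.
rng = np.random.default_rng(0)
P, H, n = 6, 15, 4
theta = (rng.normal(size=(P, H)), rng.normal(size=H), rng.normal(size=H), 0.1)
x = rng.normal(size=(n, P))
t = network_output(x, *theta)          # targets that the network fits exactly
print(log_likelihood(t, x, theta, sigma=0.05))
print(log_likelihood(t + 1.0, x, theta, sigma=1.0))
```

Working with the logarithm of Eq. (4) rather than the product itself avoids
numerical underflow when the number of training events is large.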

Having the likelihood, we also need the prior to compute the posterior density.
The choice of prior is not obvious. However, experience suggests that a
reasonable class is that of Gaussian priors centered at zero, which prefer
smaller rather than larger weights, because smaller weights yield smoother fits
to the data. In the paper, a Gaussian prior is specified for each weight using
the BNN package of Radford Neal¹. The variance for weights belonging to a given
group (input-to-hidden weights ($u_{ij}$), hidden biases ($a_j$),
hidden-to-output weights ($v_j$) or output bias ($b$)) is chosen to be the
same: $\sigma_u^2$, $\sigma_a^2$, $\sigma_v^2$, $\sigma_b^2$, respectively.
Since we do not know, a priori, what these variances should be, their values
are allowed to vary over a large range, while favoring small variances. This is
done by assigning each variance a gamma prior



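$$ p(z) = \frac{(\alpha/\mu)^{\alpha}}{\Gamma(\alpha)}\, z^{\alpha - 1} e^{-z\alpha/\mu} \qquad (5) $$

where $z = \sigma^{-2}$, and the mean $\mu$ and shape parameter $\alpha$ are set
to some fixed plausible values. The gamma prior is referred to as a hyperprior,
and a parameter of the hyperprior is called a hyperparameter.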
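
A gamma density with a given mean and shape parameter can be sketched as below.
The parameterisation (rate equal to shape divided by mean) is the standard one
for a gamma distribution with those two parameters; the numerical values of the
hyperparameters are hypothetical, and this is not taken from Neal's package.

```python
import numpy as np
from math import lgamma, log

def log_gamma_prior(z, mean, shape):
    """Log density of a gamma prior on the precision z = 1/sigma^2,
    parameterised by its mean and its shape parameter."""
    rate = shape / mean
    return shape * log(rate) - lgamma(shape) + (shape - 1.0) * log(z) - rate * z

rng = np.random.default_rng(1)
mean, shape = 4.0, 2.0                                   # hypothetical hyperparameters
z = rng.gamma(shape, scale=mean / shape, size=100_000)   # sampled precisions
sigma = 1.0 / np.sqrt(z)                                 # widths of the weight prior
print(z.mean(), np.median(sigma))
```

The sample mean of `z` should come out close to the chosen mean, which is a
quick check that the scale passed to the sampler matches the density.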

The posterior density, $p(\theta \mid x, t)$, is then obtained from Eqs. (2)
and (4) and the Gaussian prior. Given an event with data $x'$, an estimate of
the target value is given by the weighted average



$$ \bar{y}(x' \mid x, t) = \int y(x', \theta)\, p(\theta \mid x, t)\, d\theta \qquad (6) $$

Currently, the only practical way to perform the high-dimensional integral in
Eq. (6) is to sample the density $p(\theta \mid x, t)$ with the Markov Chain
Monte Carlo (MCMC) method [2, 8, 9, 10]. In the MCMC method, one steps through
the $\theta$ parameter space in such a way that points are visited with a
probability proportional to the posterior density, $p(\theta \mid x, t)$.
Points where $p(\theta \mid x, t)$ is large are visited more often than points
where $p(\theta \mid x, t)$ is small.

The integral in Eq. (6) is approximated by the average

$$ \bar{y}(x' \mid x, t) \approx \frac{1}{L} \sum_{i=1}^{L} y(x', \theta_i) \qquad (7) $$

where $L$ is the number of points $\theta$ sampled from $p(\theta \mid x, t)$.
Each point $\theta$ corresponds to a different neural network with the same
structure. The average is therefore an average over neural networks, and it
approaches the true value of $\bar{y}(x' \mid x, t)$ when $L$ is sufficiently
large.
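
The sampling-and-averaging idea of Eqs. (6) and (7) can be illustrated with a
toy problem. The sketch below uses a one-parameter stand-in for the network and
a plain random-walk Metropolis sampler; both are simplifications chosen for the
example and are not the sampler used by the BNN package. All names and numbers
are assumptions.

```python
import numpy as np

def predict(x, theta):
    """Toy one-parameter 'network' y = sin(theta * x), a stand-in for Eq. (3)."""
    return np.sin(theta * x)

def log_posterior(theta, x, t, sigma=0.1, prior_width=3.0):
    log_like = -np.sum((t - predict(x, theta)) ** 2) / (2 * sigma ** 2)
    log_prior = -theta ** 2 / (2 * prior_width ** 2)   # zero-centred Gaussian prior
    return log_like + log_prior

rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 40)
t = np.sin(1.5 * x) + rng.normal(scale=0.1, size=x.size)   # synthetic training data

theta, samples = 1.0, []
current = log_posterior(theta, x, t)
for _ in range(5000):
    proposal = theta + rng.normal(scale=0.1)
    candidate = log_posterior(proposal, x, t)
    if np.log(rng.uniform()) < candidate - current:   # Metropolis acceptance rule
        theta, current = proposal, candidate
    samples.append(theta)
samples = np.array(samples[1000:])                    # drop the start of the chain

x_new = 0.5
y_bar = np.mean(predict(x_new, samples))   # Eq. (7): average over sampled networks
print(y_bar, np.sin(1.5 * x_new))
```

The prediction is the mean of the outputs of all sampled parameter points, not
the output of a single best-fit point.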

¹ R. M. Neal, Software for Flexible Bayesian Modeling and Markov Chain Sampling,
http://www.cs.utoronto.ca/~radford/fbm.software.html



3 Toy Detector and Simulation[5] 

In the paper, a toy detector is designed with the CERN GEANT4 package [13] to
simulate the central detector of a reactor neutrino experiment, such as the
Daya Bay experiment [11] and the Double CHOOZ experiment [12]. The toy detector
consists of three regions: the Gd-doped liquid scintillator (Gd-LS from now
on), the normal liquid scintillator (LS from now on) and the oil buffer. It has
a cylindrical shape, like the detector modules of the Daya Bay and Double CHOOZ
experiments. The diameter of the Gd-LS region is 2.4 m and its height is 2.6 m.
The thickness of the LS region is 0.35 m, and the thickness of the oil region
is 0.40 m. The Gd-LS and LS are the same as the scintillators adopted in the
proposal of the CHOOZ experiment [14]. The 8-inch photomultiplier tubes (PMTs
from now on) are mounted inside the oil region of the detector. A total of 366
PMTs are arranged in 8 rings of 30 PMTs on the lateral surface of the oil
region, and in 5 rings of 24, 18, 12, 6 and 3 PMTs on each of the top and
bottom caps.
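
The quoted total of 366 PMTs follows from the ring layout if the five cap rings
are present on both the top and the bottom cap, as this short check shows:

```python
lateral = 8 * 30                   # 8 rings of 30 PMTs on the lateral surface
cap = sum([24, 18, 12, 6, 3])      # 5 rings on one end cap
total = lateral + 2 * cap          # top and bottom caps
print(lateral, cap, total)         # 240 63 366
```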

The detector response to neutrino and background events in the toy detector is
simulated with GEANT4. Although the physical properties of the scintillator and
the oil (optical attenuation length, refractive index and so on) are
wavelength dependent, only averages [14] are used in the detector simulation
(for example, a uniform optical attenuation length of 8 m for Gd-LS and 20 m
for LS). The program therefore does not reproduce the real detector response
exactly, but this does not affect the comparison between the BNN and the method
in Ref. [1].

4 Event Reconstruction[5] 

The task of event reconstruction in reactor neutrino experiments is to
reconstruct the energy and the vertex of a signal. The maximum likelihood
method (MLD) is a standard algorithm for event reconstruction in reactor
neutrino experiments. The likelihood is defined as the joint Poisson
probability of observing a measured distribution of photoelectrons over all
PMTs for given $(E, \vec{x})$ coordinates in the detector. Ref. [15], on the
work of the CHOOZ experiment, describes the reconstruction method in detail.

In the paper, the event reconstruction with the MLD is performed in a way
similar to that of the CHOOZ experiment [15]. Because the detector differs from
the CHOOZ detector, there are some differences with respect to Ref. [15]:

(1) The detector in the paper consists of three regions, so the path length from
a signal vertex to a PMT consists of three parts: the path length in the Gd-LS
region, the one in the LS region, and the one in the oil region.

(2) Considering that not all PMTs in the detector receive photoelectrons when
an electron deposits its energy in the detector, the equation is modified in
the paper and differs from the one in the CHOOZ experiment. The fit uses
$\sum_{N_j = 0} \bar{N}_j + \sum_{N_j \neq 0} \left( \bar{N}_j - N_j + N_j \log\frac{N_j}{\bar{N}_j} \right)$,
where $N_j$ is the number of photoelectrons received by the j-th PMT and
$\bar{N}_j$ is the expected one for the j-th PMT [15].

(3) $c_E \times N_{total}$ and the coordinates of the charge center of gravity
of all visible photoelectrons from a signal are used as the starting values for
the fit parameters $(E, \vec{x})$, where $N_{total}$ is the total number of
visible photoelectrons from a signal and $c_E$ is the proportionality constant
of the energy $E$, that is, $E = c_E \times N_{total}$. $c_E$ is obtained by
fitting the $N_{total}$ values of 1 MeV electron events; the fit corresponds to
235 photoelectrons per MeV in the paper.
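
A Poisson likelihood of the kind described in item (2) can be sketched as
follows. The function name and the example numbers are hypothetical; the
expression is the usual Poisson log-likelihood ratio, in which a PMT with no
observed photoelectrons contributes only its expected count. The model that
produces the expected counts from $(E, \vec{x})$ is not shown.

```python
import numpy as np

def poisson_nll(observed, expected):
    """Sum over PMTs of Nbar_j - N_j + N_j * log(N_j / Nbar_j).

    For PMTs with N_j = 0 the term reduces to Nbar_j, so they are
    handled separately to avoid evaluating log(0).
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    hit = observed > 0
    value = np.sum(expected[~hit])
    value += np.sum(expected[hit] - observed[hit]
                    + observed[hit] * np.log(observed[hit] / expected[hit]))
    return value

# Hypothetical photoelectron counts for three PMTs.
print(poisson_nll([0, 3, 5], [0.4, 3.0, 5.0]))
print(poisson_nll([0, 3, 5], [0.4, 2.0, 7.0]))
```

The quantity is zero only when every PMT observes exactly its expected count,
and grows as the trial $(E, \vec{x})$ moves away from the best fit.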



5 Monte-Carlo Sample 

5.1 Monte-Carlo Sample for Reactor Neutrinos 

According to the anti-neutrino interaction in the detector of the reactor
neutrino experiments [16], neutrino events from random directions and from the
particular direction (0.433, 0.75, -0.5) are generated uniformly throughout the
Gd-LS region of the toy detector. Fig. 1 shows four important physical
quantities of the Monte-Carlo reactor neutrino events: $E_{e^+}$, $E_n$,
$\Delta t_{e^+n}$ and $d_{e^+n}$. The selection criteria for the neutrino
events are as follows:

(1) Positron energy: 1.3 MeV < $E_{e^+}$ < 8 MeV;

(2) Neutron energy: 6 MeV < $E_n$ < 10 MeV;

(3) Neutron delay: 2 μs < $\Delta t_{e^+n}$ < 100 μs;

(4) Relative positron-neutron distance: $d_{e^+n}$ < 100 cm.

10000 events from random directions and 5000 events from (0.433, 0.75, -0.5)
are selected according to the above criteria. The events from random directions
are used as the training sample of the BNN, and the events from
(0.433, 0.75, -0.5) are used as the test sample of the BNN.
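
The four selection criteria can be applied to arrays of event quantities with a
boolean mask, for example as below. The function name, argument names and the
three example events are hypothetical.

```python
import numpy as np

def select_events(e_pos, e_n, dt, d):
    """Boolean mask for the four selection criteria.

    e_pos, e_n: positron and neutron energies in MeV;
    dt: neutron delay in microseconds; d: positron-neutron distance in cm.
    """
    return ((e_pos > 1.3) & (e_pos < 8.0)
            & (e_n > 6.0) & (e_n < 10.0)
            & (dt > 2.0) & (dt < 100.0)
            & (d < 100.0))

# Three made-up events: the second fails the positron energy cut,
# the third fails the neutron delay cut.
e_pos = np.array([2.5, 0.9, 4.0])
e_n = np.array([8.0, 8.0, 8.0])
dt = np.array([30.0, 30.0, 150.0])
d = np.array([20.0, 20.0, 20.0])
mask = select_events(e_pos, e_n, dt, d)
print(mask)
```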



5.2 Monte-Carlo Sample for Supernova Neutrinos

Neutrino events from random directions and from the particular direction
(0.354, 0.612, -0.707) are generated uniformly throughout the Gd-LS region of a
liquid scintillator detector with the same geometry and the same target as the
toy detector in sec. 3, according to the following supernova $\bar{\nu}_e$
energy distribution [1, 17]:

$$ \frac{dN}{dE} \propto \frac{E^2}{1 + e^{E/T}} \qquad (8) $$

with $T = 3.3$ MeV; the supernova is considered to be at 10 kpc. The number of
fixed-direction neutrino events corresponds to the number that, for a supernova
at 10 kpc, could be detected in a liquid scintillator experiment with mass equal
to that of SK [1]. The events from random directions are used as the training
sample of the BNN, and the events from (0.354, 0.612, -0.707) are used as the
test sample of the BNN. Fig. 2 shows four important physical quantities of the
Monte-Carlo supernova neutrino events: $E_{e^+}$, $E_n$, $\Delta t_{e^+n}$ and
$d_{e^+n}$.
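
Energies following a spectrum of the form $E^2 / (1 + e^{E/T})$ can be drawn by
rejection sampling, as sketched below. The Fermi-Dirac form and the value
T = 3.3 MeV are taken as read from the text above and should be treated as
assumptions; the cut-off energy, the function names and the sample size are
hypothetical, and no cross-section weighting is applied.

```python
import numpy as np

T = 3.3  # MeV, temperature as read from the text (assumed)

def spectrum(e):
    """Unnormalised dN/dE = E^2 / (1 + exp(E / T))."""
    return e ** 2 / (1.0 + np.exp(e / T))

def sample_energies(n, e_max=60.0, seed=3):
    """Draw n energies in [0, e_max] MeV by rejection sampling."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, e_max, 6001)
    f_max = spectrum(grid).max() * 1.01     # flat envelope slightly above the peak
    out = np.empty(0)
    while out.size < n:
        e = rng.uniform(0.0, e_max, size=2 * n)
        keep = rng.uniform(0.0, f_max, size=2 * n) < spectrum(e)
        out = np.concatenate([out, e[keep]])
    return out[:n]

energies = sample_energies(20_000)
print(energies.mean())
```

For this spectral shape the mean energy is about 3.15 T, which gives a simple
sanity check on the sampled values.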



6 Location of the neutrino source using the method in Ref. [1]

The inverse-$\beta$ decay can be used to locate the neutrino source in
scintillator detector experiments. The method is based on the neutron boost in
the forward direction: the neutron retains a memory of the neutrino source
direction. The unit vector $\vec{X}_{e^+n}$, having its origin at the
reconstructed positron position and pointing to the captured neutron position,
is defined for each neutrino event. The distribution of the projection of this
vector along the known neutrino direction is forward peaked, but its r.m.s.
value is not far from that of a flat distribution
($\sigma_{flat} = 1/\sqrt{3}$). $\vec{p}$ is defined as the average of the
vectors $\vec{X}_{e^+n}$, that is



$$ \vec{p} = \frac{1}{N} \sum \vec{X}_{e^+n} \qquad (9) $$

The measured neutrino direction is the direction of $\vec{p}$.

To evaluate the uncertainty in the direction of $\vec{p}$, the neutrino
direction is assumed to lie along the z axis. From the central limit theorem it
follows that the distribution of the three components of $\vec{p}$ is Gaussian
with $\sigma = 1/\sqrt{3N}$, centered at $(0, 0, |\vec{p}|)$. Therefore, the
uncertainty on the measurement of the neutrino direction can be given as the
cone around $\vec{p}$ which contains 68.3% of the integral of this
distribution.
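
One way to evaluate such a cone numerically is with toy vectors, as in the
sketch below. It draws vectors from a Gaussian of width $1/\sqrt{3N}$ centred on
$(0, 0, |\vec{p}|)$ and takes the 68.3% quantile of their angle to the z axis.
This is an assumed reading of the procedure, not a reproduction of the
calculation in Ref. [1]; the function names and the number of toys are
hypothetical.

```python
import numpy as np

def mean_direction(x_vectors):
    """Eq. (9): average of the positron-to-neutron unit vectors, shape (N, 3)."""
    return np.mean(x_vectors, axis=0)

def cone_half_angle(p_norm, n_events, level=0.683, n_toys=200_000, seed=4):
    """Half-angle (degrees) of the cone around the z axis that contains a
    fraction `level` of toy vectors drawn from a Gaussian with
    sigma = 1/sqrt(3N) centred at (0, 0, |p|)."""
    rng = np.random.default_rng(seed)
    sigma = 1.0 / np.sqrt(3.0 * n_events)
    toys = rng.normal(loc=[0.0, 0.0, p_norm], scale=sigma, size=(n_toys, 3))
    cos_angle = toys[:, 2] / np.linalg.norm(toys, axis=1)
    angles = np.degrees(np.arccos(cos_angle))
    return np.quantile(angles, level)

# Values for the reactor test sample: |p| = 0.033 with N = 5000 events.
print(cone_half_angle(p_norm=0.033, n_events=5000))
```

With these inputs the sketch gives an angle of roughly 21°, the same order as
the uncertainty quoted for the method of Ref. [1] in the reactor case.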



7 Location of the neutrino source using BNN 

In the paper, the x, y and z components of the neutrino incoming direction are
predicted by three BNNs, one per component. Each BNN has an input layer of 6
inputs, a single hidden layer of 15 nodes and an output layer of one output.
Here we explain in detail the case of predicting the x component of the
neutrino incoming direction:

(1) The data format for the training sample is $d_i$, $f_i$, $E_{e^+}$, $E_n$,
$\Delta t_{e^+n}$, $d_{e^+n}$, $t_i$ ($i = x$), where $d_i$ is the difference
between $v_i$ and $n_i$. $v_i$ is the x component of the $\vec{X}_{e^+n}$
defined in section 6, $n_i$ is the x component of the known neutrino incoming
direction ($\vec{n}$), and $f_i$ is the x component of the reconstructed
positron position. $d_i$, $f_i$, $E_{e^+}$, $E_n$, $\Delta t_{e^+n}$ and
$d_{e^+n}$ are used as inputs to a BNN, and $t_i$ is the known target. The
target is obtained from Eq. (10):

$t_i$ = [expression illegible in the source] ($i = x$). (10)

where [definition illegible in the source]

(2) The inputs of the test sample are similar to those of the training sample,
but $d_i$ ($i = x$) is computed differently: the $\vec{p}$ obtained with the
method in section 6 is substituted for the known neutrino incoming direction
when computing $d_i$. $t_{pi}$ ($i = x$) is the output of the BNN, that is, the
value predicted by the BNN. We use the $t_{pi}$ value to compute the x
component of the neutrino incoming direction via the following equation (in
fact, Eq. (11) is the inverse function of Eq. (10)):

$m_i$ = [expression illegible in the source] ($i = x$), (11)

where $v_i$ ($i = x$) is the x component of $\vec{X}_{e^+n}$, and $m_i$
($i = x$) is the x component of the direction vector ($\vec{m}$) predicted by
the BNN.

To predict the x component of the neutrino incoming direction with the BNN, a
Markov chain of neural networks is generated from the training sample using the
BNN package of Radford Neal. One thousand iterations, of twenty MCMC steps
each, are used in the paper. Because the correlation between adjacent steps is
very high, the neural network parameters are stored only once per iteration,
that is, after every twenty steps, which lessens the correlation between the
saved points in parameter space. It is also necessary to discard the initial
part of the Markov chain, because the correlation between the initial point of
the chain and the points of that part is very high. The initial three hundred
iterations are discarded in the paper.
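
The bookkeeping of steps, stored iterations and discarded iterations can be
sketched with array slicing. The random-walk array below is only a stand-in for
a real Markov chain, and the variable names are hypothetical.

```python
import numpy as np

steps_per_iteration, n_iterations, n_burn_in = 20, 1000, 300

rng = np.random.default_rng(5)
# Stand-in for a Markov chain: one scalar "parameter" per MCMC step.
chain = np.cumsum(rng.normal(size=steps_per_iteration * n_iterations))

# Keep only the last step of each iteration, then drop the first 300 iterations.
stored = chain[steps_per_iteration - 1::steps_per_iteration]
used = stored[n_burn_in:]
print(chain.size, stored.size, used.size)   # 20000 1000 700
```

With these settings 700 stored networks would remain for the average over
networks.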

The y and z components of $\vec{m}$ are obtained with the same method, with
$i = y$ and $i = z$ respectively. $\vec{L}$ is defined as the unit vector of
the $\vec{m}$ predicted by the BNNs for each event in the test sample. As in
section 6, we define the direction $\vec{q}$ as the average of the unit
direction vectors predicted by the BNNs, that is

$$ \vec{q} = \frac{1}{N} \sum \vec{L} \qquad (12) $$

$\vec{q}$ is the neutrino incoming direction predicted by the BNNs. The
uncertainty on it is evaluated with the same method as in section 6. The r.m.s.
value of the distribution of the projection of the unit direction vectors
predicted by the BNNs is obtained in the same way as in Ref. [1]. From the
central limit theorem it follows that the distributions of the three components
of $\vec{q}$ are Gaussian with $\sigma = \mathrm{r.m.s.}/\sqrt{N}$, centered at
$(0, 0, |\vec{q}|)$. Therefore, the uncertainty on the measurement of the
neutrino direction can be given as the cone around $\vec{q}$ which contains
68.3% of the integral of this distribution.
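
The quantities used in this step can be sketched as below: the mean vector of
Eq. (12), the r.m.s. of the projections, the resulting Gaussian width, and the
conversion of a direction vector to two angles. The function names are
hypothetical, the r.m.s. is taken here to be the standard deviation of the
projections, and the two angles are assumed to be the azimuthal and polar
angles of the direction.

```python
import numpy as np

def direction_angles(v):
    """Azimuthal and polar angles (degrees) of a 3-vector."""
    x, y, z = v
    phi = np.degrees(np.arctan2(y, x))
    theta = np.degrees(np.arccos(z / np.linalg.norm(v)))
    return phi, theta

def summarise(unit_vectors, true_direction):
    """Mean vector (Eq. 12), r.m.s. of the projections on the true direction,
    and sigma = r.m.s. / sqrt(N)."""
    q = unit_vectors.mean(axis=0)
    projections = unit_vectors @ true_direction
    rms = projections.std()
    return q, rms, rms / np.sqrt(len(unit_vectors))

phi, theta = direction_angles(np.array([0.433, 0.75, -0.5]))
print(round(phi, 1), round(theta, 1))   # 60.0 120.0
```

Applied to the two fixed directions used for the test samples, the conversion
gives angles of about (60°, 120°) and (60°, 135°).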



8 Results 

Fig. 3 shows the distributions of the projections of $\vec{X}_{e^+n}$ (sec. 6)
and of the $\vec{L}$ predicted by the BNN method along the reactor neutrino
incoming direction. The r.m.s. attainable with the BNN is only about 0.41, less
than that attainable with the method in Ref. [1]. The results of the
determination of the reactor neutrino incoming direction with the method in
Ref. [1] and with the BNN method are shown in Table 1. The uncertainty
attainable with the method in Ref. [1] is 21.1°, and the one attainable with
the BNN is 1.0°. Fig. 4 shows the distributions of the projections of
$\vec{X}_{e^+n}$ (sec. 6) and of the $\vec{L}$ predicted by the BNN method
along the supernova neutrino incoming direction. The r.m.s. attainable with the
BNN is about 0.35. The results of the determination of the supernova neutrino
incoming direction with the method in Ref. [1] and with the BNN method are
shown in Table 2. The uncertainty attainable with the method in Ref. [1] is
10.7°, and the one attainable with the BNN is 0.6°.

Compared to the method in Ref. [1], the uncertainty attainable with the BNN is
thus significantly improved, being smaller by a factor of about 20 (21°
compared to 1° in the case of reactor neutrinos and 11° compared to 0.6° in the
case of supernova neutrinos). Compared to SK, it is smaller by a factor of
about 8 (5° compared to 0.6°). Why can such good results be obtained with the
BNN? First, the neutrino directions obtained with the method in Ref. [1] are
used as inputs to the BNN, so the BNN results build on the results of that
method. Second, the BNN can extract additional information from its inputs and
discover more general relationships in data than traditional statistical
methods. Third, the over-fitting problem is handled by using Bayesian methods
to control model complexity. The results obtained with the BNN can therefore be
much better than those of the method in Ref. [1]. In short, the BNN method can
be applied to determine the neutrino incoming direction in reactor neutrino
experiments and to locate supernova explosions with scintillator detectors.
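
The quoted improvement factors can be checked directly from the uncertainties
given in the text (the 5° figure for SK is the value quoted above):

```python
reactor = 21.1 / 1.0       # Ref. [1] method vs BNN, reactor neutrinos
supernova = 10.7 / 0.6     # Ref. [1] method vs BNN, supernova neutrinos
vs_sk = 5.0 / 0.6          # SK vs BNN, supernova neutrinos
print(round(reactor, 1), round(supernova, 1), round(vs_sk, 1))   # 21.1 17.8 8.3
```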

Tab. 1: Measurement of reactor neutrino direction

                                                 |p| or |q|   $\phi$    $\theta$   uncertainty
known neutrino incoming direction                             60°       120°
Direction determined by the method in Ref. [1]   0.033        42.5°     111.4°     21.1°
Direction determined by BNN                      0.708        56.7°     118.9°     1.0°


Tab. 2: Measurement of supernova neutrino direction

                                                 |p| or |q|   $\phi$    $\theta$   uncertainty
known neutrino incoming direction                             60°       135°
Direction determined by the method in Ref. [1]   0.066        61.0°     149.2°     10.7°
Direction determined by BNN                      0.727        55.8°     138.5°     0.6°

Fig. 1: The reactor neutrino events for the Monte-Carlo simulation of the toy
detector are uniformly generated throughout the Gd-LS region. (a) is the
distribution of the positron energy; (b) is the distribution of the energy of
the neutron captured by Gd; (c) is the distribution of the distance between the
positron and neutron positions; (d) is the distribution of the delay time of
the neutron signal.

Fig. 2: The supernova neutrino events for the Monte-Carlo simulation of a liquid
scintillator detector with the same geometry and the same target as the toy
detector in sec. 3 are uniformly generated throughout the Gd-LS region. (a) is
the distribution of the positron energy; (b) is the distribution of the energy
of the neutron captured by Gd; (c) is the distribution of the distance between
the positron and neutron positions; (d) is the distribution of the delay time
of the neutron signal.

Fig. 3: The distributions of the projections of $\vec{X}_{e^+n}$ (sec. 6) and of
the $\vec{L}$ predicted by the BNN method along the reactor neutrino incoming
direction.

Fig. 4: The distributions of the projections of $\vec{X}_{e^+n}$ (sec. 6) and of
the $\vec{L}$ predicted by the BNN method along the supernova neutrino incoming
direction.



